Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce UserDefinedLogicalNodeUnparser for User-defined Logical Plan unparsing #13880

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

goldmedal
Copy link
Contributor

@goldmedal goldmedal commented Dec 22, 2024

Which issue does this PR close?

Closes #13753 .

Rationale for this change

See the previous discussion for the design: #13753 (comment)

UserDefinedLogicalNodeUnparser provides two APIs for user-defined behavior:

  • unparse: Unparse the custom logical node to SQL within a statement.
  • unparse_to_statement: Unparse the custom logical node to a statement.

What changes are included in this PR?

  • Introduce UserDefinedLogicalNodeUnparser
  • Make the AST builders public
  • Add examples for unparsing custom logical plan

Are these changes tested?

yes

Are there any user-facing changes?

New trait and API

@goldmedal goldmedal marked this pull request as ready for review December 22, 2024 12:23
@goldmedal
Copy link
Contributor Author

Could @phillipleblanc or @sgrebnov take a look at this? Thanks

/// The child unparsers are called iteratively.
/// There are two methods in [`Unparser`] will be called:
/// - `extension_to_statement`: This method is called when the custom logical node is a custom statement.
/// If multiple child unparsers return a non-None value, the last unparsing result will be returned.
Copy link
Member

@sgrebnov sgrebnov Dec 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@goldmedal - I'm not sure using the last unparsing result is the expected behavior. As a user, I would expect to get the result from the first udlp_unparser that supports this node and stop checking the remaining udlp_unparsers instead.

Is there a specific use case / reason for using the last supported udlp_unparser? They can be dynamically registered and the last one should override perviously registered? To match unparse behavior where we don't know/track if unparsing is applied so we always apply all?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I would also expect this to short-circuit and have the first one win.

select: &mut Option<&mut SelectBuilder>,
relation: &mut Option<&mut RelationBuilder>,
) -> Result<()> {
for unparser in &self.udlp_unparsers {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be good to add indication that unparse applied to be consistent with unparse_to_statement and throw error if non of registered udlps applied / successfully processed the node.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, if none of the extension unparsers were able to process it, it should throw an error IMO

@sgrebnov
Copy link
Member

@goldmedal - looks great, two minor questions/comments

Copy link
Contributor

@phillipleblanc phillipleblanc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @goldmedal! I have a few minor comments, but this this a good upgrade for the unparser!

/// If multiple child unparsers return a non-None value, the last unparsing result will be returned.
/// - `extension_to_sql`: This method is called when the custom logical node is part of a statement.
/// If multiple child unparsers are registered for the same custom logical node, all of them will be called in order.
pub fn with_udlp_unparsers(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a fan of this name - udlp takes effort to understand what it means. How about renaming udlp_* to extension_*? i.e. with_extension_unparsers. It conveys the same meaning in an easier to understand way.

/// The child unparsers are called iteratively.
/// There are two methods in [`Unparser`] will be called:
/// - `extension_to_statement`: This method is called when the custom logical node is a custom statement.
/// If multiple child unparsers return a non-None value, the last unparsing result will be returned.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I would also expect this to short-circuit and have the first one win.

/// - `extension_to_statement`: This method is called when the custom logical node is a custom statement.
/// If multiple child unparsers return a non-None value, the last unparsing result will be returned.
/// - `extension_to_sql`: This method is called when the custom logical node is part of a statement.
/// If multiple child unparsers are registered for the same custom logical node, all of them will be called in order.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should also short-circuit and only do the first one?

select: &mut Option<&mut SelectBuilder>,
relation: &mut Option<&mut RelationBuilder>,
) -> Result<()> {
for unparser in &self.udlp_unparsers {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, if none of the extension unparsers were able to process it, it should throw an error IMO

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also rename this file to extension_unparser.rs

if let Some(err) = plan_to_sql(&plan).err() {
assert_eq!(
err.to_string(),
"External error: `relation` must be initialized"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error is expected?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sql SQL Planner
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support unparsing LogicalPlan::Extension to SQL tesxt
3 participants